Abstract

We study the makespan minimization problem with unrelated selfish machines under the assump- 
tion that job sizes are stochastic. We design simple truthful mechanisms that under various distribu- 
tional assumptions provide constant and sublogarithmic approximations to expected makespan. Our 
mechanisms are prior-independent in that they do not rely on knowledge of the job size distributions. 
Prior-independent approximation mechanisms have been previously studied for the objective of revenue 
maximization [13, 11, 26]. In contrast to our results, in prior-free settings no truthful anonymous deterministic
mechanism for the makespan objective can provide a sublinear approximation [3].
1 Introduction



We study the problem of scheduling jobs on machines to minimize makespan in a strategic context. The 
makespan is the longest time that any machine takes to complete the work assigned to it by the schedule. The
running time or size of a job on a machine is drawn from a fixed distribution, and is a private input known 
to the machine but not to the optimizer. The machines are unrelated in the sense that the running time of 
a job on distinct machines may be distinct. A scheduling mechanism solicits job running times from the 
machines and determines a schedule as well as compensation for each of the machines. The machines are 
strategic and try to maximize the compensation they receive minus the work they perform. We are interested 
in understanding and quantifying the loss in performance due to the strategic incentives of the machines who 
may misreport the job running times. 

A primary concern in the theory of mechanism design is to understand the compatibility of various ob- 
jectives of the designer with the incentives of the participants. As an example, maximizing social welfare is 
incentive compatible; the Vickrey-Clarke-Groves (VCG) mechanism obtains this socially optimal outcome 
in equilibrium [27, 9, 15]. For most other objectives, however, the optimal solution ignoring incentives
(a.k.a. the first-best solution) cannot be implemented in an incentive compatible manner. This includes, 
for example, the objectives of revenue maximization, welfare maximization with budgets, and makespan 
minimization with unrelated machines. For these objectives there is no incentive compatible mechanism 
that is best on every input. The classical economic approach to mechanism design thus considers inputs 
drawn from a distribution (a.k.a. the prior) and looks for the mechanism that maximizes the objective in 
expectation over the distribution (a.k.a. the second-best solution). 

The second-best solution is generally complex and, by definition, tailored to specific knowledge that 
the designer has on the distribution over the private information (i.e., the input) of the agents. The non- 
pointwise optimality, complexity, and distributional dependence of the second-best solution motivates a 
number of mechanism design and analysis questions. 

price of anarchy: For any distribution over inputs, bound the gap between the first-best (optimal without 
incentives) and second-best (optimal with incentives) solutions (each in expectation over the input). 

computational tractability: For any distribution over inputs, give a computationally tractable implemen- 
tation of the second-best solution, or if the problem is intractable give a computationally tractable 
approximation mechanism. 

simplicity: For any distribution over inputs, give a simple, practical mechanism that approximates the 
second-best solution. 

prior independence: Give a single mechanism that, for all distributions over inputs, approximates the 
second-best solution. 

These questions are inter-related. As the second-best mechanism is often complex, the price of anarchy 
can be bounded via a lower bound on the second-best mechanism as given by a simple approximation 
mechanism. Similarly, to show that a mechanism is a good approximation to second-best the upper bound 
given by the first-best solution can be used. Importantly though, if the first-best solution does not permit 
good approximation mechanisms then a better bound on the second-best solution should be sought. Each 
of the questions above can be further refined by consideration with respect to a large class of priors (e.g. 
identical distributions). 
The prior-independence question gives a middle ground between worst-case mechanism design and
Bayesian mechanism design. It attempts to achieve the best of both worlds in the tradeoff between infor- 
mational efficiency and approximate optimality. Its minimal usage of information about the setting makes 
it robust. A typical side-effect of this robustness is simple and natural mechanisms; indeed, our prior- 
independent mechanisms will be simple, computationally tractable, and also enable a bound on the price of 
anarchy. 

The literature on prior-independent mechanism design has focused primarily on the objective of rev- 
enue maximization. Hartline and Roughgarden [17] show that with sufficient competition, the welfare 
maximizing (VCG) mechanism also attains good revenue. This result enables the prior-independent approximation
mechanism for single-item auctions of Dhangwatnotai, Roughgarden, and Yan [13] and the
multi-item approximation mechanisms of Devanur et al. [11] and Roughgarden et al. [26]. Importantly,
in single-item auctions the agents' private information is single-dimensional whereas in multi-item auc- 
tions it is multi-dimensional. There are several interesting and challenging directions in prior-independent 
mechanism design: (1) non-linear objectives, (2) general multi-parameter preferences of agents, (3) non- 
downwards-closed feasibility constraints, and (4) non-identically distributed types of agents. Our work 
addresses the first three of these four challenges. 

We study the problem of scheduling jobs on machines where the runtime of a job on a machine is that 
machine's private information. The prior over runtimes is a product distribution that is symmetric with 
respect to the machines (but not necessarily symmetric with respect to the jobs). Ex ante, i.e., before the 
job sizes are instantiated, the machines appear identical; ex post, i.e., after the job sizes are realized, the 
machines are distinct and job runtimes are unrelated. The makespan objective is to schedule the jobs on 
machines so as to minimize the time at which the last machine completes all of its assigned jobs. Our goal 
is a prior-independent approximation of the second-best solution for the makespan objective. 

To gain intuition for the makespan objective, consider why the simple and incentive compatible VCG 
mechanism fails to produce a good solution in expectation. The VCG mechanism for scheduling minimizes 
the total work done by all of the machines and accordingly places every job on its best machine. Note that 
because the machines are a priori identical, this is an i.i.d. uniformly random machine for every job. There- 
fore, in expectation, every machine gets an equal number of jobs. Furthermore, every job simultaneously 
has its smallest size possible. However, the maximum load in terms of the number of jobs per machine and 
so also the makespan can be quite large. The distribution of jobs across machines is akin to the distribution 
of balls into bins in the standard balls-in-bins experiment: when the number of balls and bins is equal, the
most heavily loaded bin contains Θ(log n / log log n) balls with high probability even though the average load
is 1. 
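
To make the balls-in-bins comparison concrete, here is a minimal Python sketch. It is an illustration only, not part of the original analysis, and the function names max_load and average_max_load are hypothetical. It estimates the maximum number of jobs on any machine when each of n jobs independently picks one of n machines uniformly at random, and prints the estimate next to log n / log log n.

```python
import math
import random
from collections import Counter


def max_load(n_balls: int, n_bins: int, rng: random.Random) -> int:
    """Throw n_balls into n_bins uniformly at random; return the fullest bin's count."""
    counts = Counter(rng.randrange(n_bins) for _ in range(n_balls))
    return max(counts.values())


def average_max_load(n: int, trials: int = 200, seed: int = 0) -> float:
    """Monte Carlo estimate of the expected maximum load with n balls and n bins."""
    rng = random.Random(seed)
    return sum(max_load(n, n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        growth = math.log(n) / math.log(math.log(n))
        # Columns: n, estimated expected maximum load, log n / log log n.
        print(n, average_max_load(n, trials=20), round(growth, 2))
```

The average load is exactly 1 in every run, so any growth in the first printed column is due purely to the unevenness of the random assignment.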

Our designed mechanism must prevent the above balls-in-bins style behavior. Consider a variant of 
VCG that we call the bounded overload mechanism. The bounded overload mechanism minimizes the 
total work with the additional feasibility constraint that the load (i.e., number of jobs scheduled) of any 
machine is bounded to be at most a c factor more than the average load. This mechanism is "maximal in 
range", i.e., it is simply the VCG mechanism with a restricted space of feasible outcomes; it is therefore 
incentive compatible. Moreover, the bounded overload mechanism can be viewed as belonging to a class of
"supply limiting" mechanisms (cf. the prior-independent supply-limiting approximation mechanism of [26]
for multi-item revenue maximization). 

While the bounded overload mechanism evens out the number of jobs per machine, an individual job 
may end up having a running time far larger than that on its best machine. The crux of our analysis is to 
show that this does not hurt the expected makespan of our schedule relative to an ideal setting where every 
job assumes its minimum size. Our analysis of job sizes has two components. First we show that every job 
with high probability gets assigned to one of its best machines. Second, we show that the running time of
a job on its ith best machine can be related within a factor depending on i to its running time on its best 
machine. These components together imply that the bounded overload mechanism simultaneously obtains 
a schedule that is balanced in terms of the number of jobs per machine and where every job has a small size 
(in comparison to the best possible for that job). This is sufficient to imply a constant factor approximation 
to expected makespan when the number of jobs is proportional to the number of machines. 

The second component of our analysis of job sizes in the bounded overload mechanism entails relating 
different order statistics of (arbitrary) i.i.d. distributions, a property that may have broader applications. 
In particular, letting X[k:n] denote the kth minimum out of n independent draws from a distribution, we
show that for any k and n, X[k:n] is nearly stochastically dominated by an exponential function of k times
X[1:n/2]. In simple terms, the minimum out of a certain number of draws cannot be arbitrarily smaller than
the kth minimum out of twice as many draws. 

As an intermediary step in our analysis we bound the performance of our approximation mechanism with 
respect to the first-best solution with half the machines (recall, machines are a priori identical). Within the 
literature on prior-independent revenue maximization this approach closely resembles the classical
Bulow-Klemperer theorem. For auctioning k units of a single item to n agents (with values drawn i.i.d. from a
"nice" distribution), the revenue from welfare maximization exceeds the optimal revenue from n - k agents.
In other words, a simple prior-independent mechanism with extra competition (namely, k extra agents) is 
better than the prior-optimal mechanism for expected revenue. Our result is similar: when the number of 
jobs is at most the number of machines and machines are a priori identical, we present a prior-independent 
mechanism that is a constant approximation to makespan with respect to the first-best (and therefore also 
with respect to the second-best) solution with half as many machines. Unlike the Bulow-Klemperer theorem 
we place no assumptions on the distribution of jobs on machines besides symmetry with respect to machines.

To design scheduling mechanisms for the case where the number of jobs is large relative to the number 
of machines we can potentially take advantage of the law of large numbers. If there are many more large 
jobs (i.e., jobs for which the best of the machines' runtimes is significant) then assigning jobs to machines 
to minimize total work will produce a schedule where the maximum work on any machine is concentrated 
around its expectation; moreover, the expected load of any machine in the schedule that minimizes total 
work is at most the expected load of any machine in the schedule that minimizes makespan. 

On the other hand, if there are a moderate number, e.g., proportional to the number of machines, of jobs 
with very large runtimes on all machines, both the minimum work mechanism and the bounded overload 
mechanism can fail to have good expected makespan. For the bounded overload mechanism, although 
the distribution of jobs across machines is more-or-less even, the distribution of the few "worst" jobs that 
contribute the most to the makespan may be highly uneven. Indeed, for a distribution where the expected 
number of large jobs is about the same as the number of machines, the bounded overload mechanism exhibits 
the same bad balls-in-bins behavior as the minimum work mechanism. 

The problem above is that the existence of many small, but relatively easy to schedule jobs, prevents the 
bounded overload mechanism from working. To solve this problem we employ a two stage approach. The 
first stage acts as a sieve and schedules the small jobs to minimize total work while leaving the large
jobs unscheduled. Then in the second stage the bounded overload mechanism is run on the unscheduled 
jobs. With the proper parameter tunings (i.e., job size threshold for the sieve and partitioning of machines to 
the two stages) this mechanism gives a schedule with approximately optimal expected makespan. We give 
two parameter tunings and analyses, one which gives an O(√(log m)) approximation and the other that gives
an O((log log m)^2) approximation under a certain tail condition on the distribution of job sizes (satisfied,
for example, by all monotone hazard rate distributions). 
The proper tuning of the parameters of the mechanism requires knowledge of a single order statistic of
the size distribution, namely the expected size of a job on its best out of k machines for an appropriate 
value of k, to decide which jobs get scheduled in which stage. This statistic can be easily estimated as the 
mechanism is running by using the reports of a small fraction of the machines as a "market analysis." To 
keep our exposition and analysis simple, we skip this detail and assume that the statistic is known. 

Related work 

There is a large body of work on prior-free mechanism design for the makespan objective. This work does 
not assume a prior distribution, instead it looks at worst-case approximation of the first-best solution (i.e., 
the optimal makespan without incentive constraints). The problem was introduced by Nisan and Ronen [25],
who showed that the minimum work (a.k.a. VCG) mechanism gives an m-approximation to makespan 
(where m is the number of machines). They gave a lower bound of two on the worst case approximation 
factor of any dominant strategy mechanism for unrelated machine scheduling. They conjectured that the 
best worst-case approximation is indeed Θ(m). Following this work, a series of papers presented better
lower bounds for deterministic as well as randomized mechanisms [8, 7, 18, 24]. Ashlagi, Dobzinski, and
Lavi [3] recently proved a restricted version of the Nisan-Ronen conjecture by showing that no anonymous
deterministic dominant-strategy incentive-compatible mechanism can achieve a factor better than m. This 
lower bound suggests that the makespan objective is fundamentally incompatible with incentives in prior- 
free settings. In this context, our work can be viewed as giving a meaningful approach for obtaining positive 
results that are close to prior-free for a problem for which most results are very negative. 

Given these strong negative results, several special cases of the problem have been studied. Lavi and 
Swamy [19] give constant factor approximations when job sizes can take on only two different values. Lu
and Yu [22, 21, 20] consider the problem over two machines, and give approximation ratios strictly better
than 2. 

Related machine scheduling is the special case where the runtime of a job on a machine is the product 
of the machine's private speed and the job's public length. Importantly, the private information of each 
machine in a related machine scheduling problem is single-dimensional, and the total length of the jobs as- 
signed to any given machine in the makespan minimizing schedule is monotone in the machine's speed. This 
monotonicity implies that the related machine makespan objective is incentive compatible (i.e., the price of 
anarchy is one). For this reason work on related machine scheduling has focused on computational tractability.
Archer and Tardos [2] give a constant approximation mechanism and Dhangwatnotai et al. [12] give
an incentive compatible polynomial time approximation scheme, thereby matching the best approximation
result absent incentives. There are no known approximation-preserving black-box reductions from mechanism
design to algorithm design for related machine scheduling; moreover, in the Bayesian model Chawla,
Immorlica, and Lucier [6] recently showed that the makespan objective does not admit black-box reductions
of the form that Hartline and Lucier [16] showed exist for the objective of social welfare maximization.

Another line of work studies the makespan objective subject to an envy-freedom constraint instead of 
the incentive-compatibility constraint. A schedule and payments (to the machines) are envy free if every 
machine prefers its own assignment and payment to that of any other machine. Mu'alem [23] introduced
the envy-free scheduling problem for makespan. Cohen et al. [10] gave a polynomial time algorithm for
computing an envy-free schedule that is an O(log m) approximation to the first-best makespan (i.e., the
optimal makespan absent envy-freedom constraints). Fiat and Levavi [14] complement this by showing that
the optimal envy-free makespan (a.k.a. second-best makespan) can be an O(log m) factor larger than the
first-best makespan. 
2 Preliminaries and main results



We consider the scheduling of n jobs on m unrelated machines where the running time of a job on a machine 
is drawn from a distribution. A schedule is an assignment of each job to exactly one machine. The load of 
a machine is the number of jobs assigned to it. The load factor is the average number of jobs per machine 
and is denoted η = n/m. The work of a machine is the sum of the runtimes of jobs assigned to it. The total
work is the sum of the works of each machine. The makespan is the most work assigned to any machine. 

The vector of running times for each of the jobs on a given machine is that machine's private informa- 
tion. A scheduling mechanism may solicit this information from the machines, may make payments to the 
machines, and must select a schedule of jobs on the machines. A scheduling mechanism is evaluated in the 
equilibrium of strategic behavior of the machines. A particularly robust equilibrium concept is dominant 
strategy equilibrium. A scheduling mechanism is incentive compatible if it is a dominant strategy for each
machine to report its true processing time for each job. 

We consider the following simple mechanisms: 

minimum work The minimum work mechanism solicits the running times, selects the schedule to mini- 
mize the total work and pays each machine its externality, i.e., the difference between the minimum 
total work when the machine does nothing and the total work of all other machines in the selected 
schedule. 

bounded overload The bounded overload mechanism is parameterized by an overload factor c > 1 and is
identical to the minimum work mechanism except it optimizes subject to placing at most cη jobs on
any machine.

sieve / anonymous reserve The sieve mechanism, also known as the anonymous reserve mechanism, is 
parameterized by a reserve β > 0 and is identical to the minimum work mechanism except that there
is a dummy machine added with runtime β for all jobs. Jobs assigned to the dummy machine are
considered unscheduled. 

sieve and bounded overload The sieve and bounded overload mechanism is parameterized by overload c,
reserve β, and a partition parameter δ. It partitions the machines into two sets of sizes (1 - δ)m
and δm. It runs the sieve with reserve β on the first set of machines and runs the bounded overload
mechanism with overload c on the unscheduled jobs and the second set of machines.

The above mechanisms are incentive compatible. The minimum work mechanism is incentive compatible 
as it is a special case of the well known Vickrey-Clarke-Groves (VCG) mechanism which is incentive com- 
patible. The bounded overload mechanism is what is known as a "maximal in range" mechanism and is 
also incentive compatible (by the VCG argument). The sieve / anonymous reserve mechanism is incentive 
compatible because the incentives of the agents in the minimum work mechanism are unaffected by the 
addition of a dummy agent. Finally, the sieve and bounded overload mechanism is incentive compatible 
because from each machine's perspective it is either participating in the sieve mechanism or the bounded 
overload mechanism. 
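
The allocation rules of the four mechanisms above can be sketched in a few lines of Python. This is a brute-force illustration for tiny instances only, under stated assumptions: payments are omitted, ties are broken by enumeration order, the cap on jobs per machine (at most cη) is passed in as an integer, and all names (min_work_schedule, sieve_and_bounded_overload, and so on) are hypothetical rather than taken from the paper.

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple

# schedule[j] is the machine of job j, or None if the job is left unscheduled.
Schedule = Tuple[Optional[int], ...]


def total_work(times: Sequence[Sequence[float]], schedule: Schedule, reserve: float = 0.0) -> float:
    """Sum of runtimes; an unscheduled job (None) is charged the reserve."""
    return sum(reserve if i is None else times[i][j] for j, i in enumerate(schedule))


def min_work_schedule(times, cap: Optional[int] = None, reserve: Optional[float] = None):
    """Brute-force minimum total work; times[i][j] is job j's reported runtime on machine i.

    cap=None, reserve=None : minimum work mechanism
    cap=k                  : bounded overload (at most k jobs per machine)
    reserve=b              : sieve (dummy machine with runtime b for every job)
    Returns None if no schedule satisfies the cap.
    """
    m, n = len(times), len(times[0])
    choices: List[Optional[int]] = list(range(m)) + ([None] if reserve is not None else [])
    best, best_cost = None, float("inf")
    for schedule in product(choices, repeat=n):
        loads = [sum(1 for i in schedule if i == k) for k in range(m)]
        if cap is not None and max(loads) > cap:
            continue  # violates the load bound
        cost = total_work(times, schedule, reserve or 0.0)
        if cost < best_cost:
            best, best_cost = schedule, cost
    return best


def makespan(times, schedule) -> float:
    """Largest total runtime assigned to any single machine."""
    return max(sum(times[i][j] for j, k in enumerate(schedule) if k == i)
               for i in range(len(times)))


def sieve_and_bounded_overload(times, split: int, reserve: float, cap: int) -> Schedule:
    """Sieve on machines [0, split), then bounded overload on machines [split, m).

    Assumes cap * (m - split) is at least the number of jobs the sieve leaves over.
    """
    first = min_work_schedule(times[:split], reserve=reserve)
    leftover = [j for j, i in enumerate(first) if i is None]
    final = list(first)
    if leftover:
        sub = [[row[j] for j in leftover] for row in times[split:]]
        second = min_work_schedule(sub, cap=cap)
        for j, i in zip(leftover, second):
            final[j] = split + i
    return tuple(final)
```

For example, with times = [[1, 1, 1, 1], [2, 2, 2, 2]] the minimum work rule places all four jobs on the first machine, while the same call with cap=3 is forced to move one job to the second machine, trading a little extra total work for a smaller makespan.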

The runtimes of jobs on machines are drawn from a product distribution (a.k.a., the prior) that is sym- 
metric with respect to the machines. (Therefore, the running times of a job on each machine are i.i.d. random 
variables.) The distribution of job j on any machine is denoted Fj; a draw from this distribution is denoted 
Tj. The best runtime of a job is its minimum runtime over all machines; this first order statistic of m random
draws from Fj is denoted by Tj[1:m].
Our goal is to exhibit a mechanism that is prior-independent and a good approximation to the expected
makespan of the best incentive compatible mechanism for the prior, i.e., the second-best solution. Because 
both the second-best and the first-best expected makespans are difficult to analyze, we will give our approx- 
imation via one of the following two lower bounds on the first-best solution. 

expected worst best runtime The expected worst best runtime is the expected value of the best runtime of 
the job with the longest best runtime, i.e., E[maxj Tj[1:m]].

expected average best runtime The expected average best runtime is the expected value of the sum of the
best runtimes of each job averaged over all machines, i.e., E[Σj Tj[1:m]]/m.

Intuitively, the former gives a good bound when the load factor is small, the latter when the load factor is 
large. We will refer to any of these bounds on the first-best makespan as OPT, with the assumption that 
which of the bounds is meant, if it is important, is clear from the context. 
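
As a sanity check on why both quantities are lower bounds, the following Python sketch computes them for a single realized runtime matrix and compares them with the optimal makespan found by exhaustive search. The paper's bounds are the expectations of these per-realization quantities over the prior; the function names are hypothetical, and the exhaustive search is only feasible for very small instances.

```python
from itertools import product


def worst_best_runtime(times):
    """max over jobs of the job's minimum runtime over machines (times[i][j])."""
    n = len(times[0])
    return max(min(row[j] for row in times) for j in range(n))


def average_best_runtime(times):
    """Sum over jobs of the job's minimum runtime, divided by the number of machines."""
    m, n = len(times), len(times[0])
    return sum(min(row[j] for row in times) for j in range(n)) / m


def optimal_makespan(times):
    """First-best makespan by enumerating all m^n schedules."""
    m, n = len(times), len(times[0])
    best = float("inf")
    for schedule in product(range(m), repeat=n):
        work = [0.0] * m
        for j, i in enumerate(schedule):
            work[i] += times[i][j]
        best = min(best, max(work))
    return best
```

Some job must run somewhere, which gives the first bound; the total work is at least the sum of best runtimes and is spread over m machines, which gives the second.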

As an intermediary in our analysis of the makespan of our scheduling mechanisms with respect to OPT, 
we will give bicriteria results that compare our mechanism's makespan to the makespan of an optimal 
schedule with fewer machines. This restriction is well defined because the machines are a priori identical.
For a given parameter δ, OPT_δ will denote the optimal schedule with δm machines (via bounds as described
above). Much of our analysis will be with respect to OPT_{1/2}, i.e., the optimal schedule with half the number
of machines. 

While it is possible to construct distributions where OPT is much smaller than OPT_{1/2}, for many
common distributions they are quite close. In fact, for the class of distributions that satisfy the monotone
hazard rate (MHR) condition,¹ OPT and OPT_{1/2} are always within a factor of four; more generally OPT
and OPT_δ are within a factor of 1/δ^2 for these distributions. (See proof in Section 5.)

Lemma 2.1 When the distributions of job sizes have monotone hazard rates the expected worst best and 
average best runtimes on δm machines are no more than 1/δ^2 times the expected worst best and average
best runtimes, respectively, on m machines. 

2.1 Main Results 

Our main theorems are as follows. When the number of jobs is comparable to the number of machines, i.e., 
the load factor η is constant, then the bounded overload mechanism is a good approximation to the optimal
makespan on m/2 machines. 

Theorem 2.2 For n jobs, m machines, load factor η = n/m, and runtimes distributed according to a
machine-symmetric product distribution, the expected makespan of the bounded overload mechanism with
overload c = 7 is a 200η approximation to the expected worst best runtime, and hence also to the optimal
makespan, on m/2 machines. 

Corollary 2.3 Under the assumptions of Theorem 2.2 where additionally the distributions of job sizes have
monotone hazard rates, the expected makespan of the bounded overload mechanism with c = 7 is an 800η
approximation to the expected optimal makespan. 

¹ The hazard rate of a distribution F is given by h(x) = f(x)/(1 - F(x)), where f is the probability density function for F; a distribution
F satisfies the MHR condition if h(x) is non-decreasing in x. Many natural distributions such as the uniform, Gaussian, and
exponential distributions, satisfy the monotone hazard rate condition. Intuitively, these are distributions with tails no heavier than 
the exponential distribution. 
When the load factor η is large and the job runtimes are identically distributed, the sieve and bounded
overload mechanism is a good approximation to the optimal makespan. The following theorems and corol- 
laries demonstrate the sieve and bounded overload mechanism under two relevant parameter settings. 

Theorem 2.4 For n jobs, m machines, and runtimes from an i.i.d. distribution, the expected makespan 
of the sieve and bounded overload mechanism with overload c = 7, partition parameter δ = 2/3, and
reserve β = ,^ ^ E[T[1:|m]] is an O(√(log m)) approximation to the larger of the expected worst best
and average best runtime, and hence also to the optimal makespan, on m/3 machines. Here T denotes a
draw from the distribution on job sizes. 

Corollary 2.5 Under the assumptions of Theorem 2.4 where additionally the distribution of job sizes has
monotone hazard rate, the expected makespan of the sieve and bounded overload mechanism is an O(√(log m))
approximation to the expected optimal makespan. 

Theorem 2.6 For n > m log m jobs, m machines, and runtimes from an i.i.d. distribution, the expected
makespan of the sieve and bounded overload mechanism with overload c = 7, partition parameter δ =
1 / log log m, and reserve
β = ^j^*^^ E[T[1:|m]], is a constant approximation to the larger of the expected worst best and average
best runtime, and hence also to the optimal makespan, on δm/2 machines. Here T denotes a draw from the
distribution on job sizes. 

Corollary 2.7 Under the assumptions of Theorem 2.6 where additionally the distribution of job sizes has
monotone hazard rate, the expected makespan of the sieve and bounded overload mechanism is an O((log log m)^2)
approximation to the expected optimal makespan. 

We prove Theorem 2.2 in Section 3 and Theorems 2.4 and 2.6 in Section 4.

2.2 Probabilistic Analysis

Our goal is to show that the simple processes described by the bounded overload and sieve mechanisms 
result in good makespan and our upper bound on makespan is given by the first order statistics of each job's 
runtime across the machines. The sieve's performance analysis is additionally governed by the law of large 
numbers. We describe here basic facts about order statistics and concentration bounds. Additionally we give 
a number of new bounds, proofs of which are in Section 5.

For random variable X and integer k, we consider the following basic constructions of k independent 
draws of the random variable. The ith order statistic, or the ith minimum of k draws, is denoted X[i:k].
The first order statistic, i.e., the minimum of the k draws, is denoted X[1:k]. The kth order statistic, i.e., the
maximum of k draws, is denoted X[k:k]. Finally, the sum of k draws is denoted X[Σk]. We include the
possibility that i or k can be random variables. We also allow the notation to cascade, e.g., for the special
case where the jobs are i.i.d. from F the lower bounds on OPT are T[1:m][n:n] and T[1:m][Σn]/m for the
expected worst best and average best runtime, respectively, and T drawn from F.
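
The order-statistic notation, including the cascaded forms, can be read operationally. The sketch below is an illustration with hypothetical helper names; it assumes the random variable is supplied as a function that returns one independent draw from a given random.Random instance, and it produces single realizations rather than expectations.

```python
import random
from typing import Callable, List

Draw = Callable[[random.Random], float]  # one independent draw of the random variable


def draws(draw: Draw, k: int, rng: random.Random) -> List[float]:
    """k independent draws of the random variable."""
    return [draw(rng) for _ in range(k)]


def order_statistic(sample: List[float], i: int) -> float:
    """X[i:k]: the ith smallest value in a sample of size k (1-indexed)."""
    return sorted(sample)[i - 1]


def worst_best(draw: Draw, m: int, n: int, rng: random.Random) -> float:
    """One realization of T[1:m][n:n]: max over n jobs of the min over m machines."""
    return max(order_statistic(draws(draw, m, rng), 1) for _ in range(n))


def average_best(draw: Draw, m: int, n: int, rng: random.Random) -> float:
    """One realization of the cascaded sum divided by m: sum over n jobs of the min over m machines, over m."""
    return sum(order_statistic(draws(draw, m, rng), 1) for _ in range(n)) / m
```

Averaging worst_best or average_best over many independent calls gives a Monte Carlo estimate of the corresponding lower bound on OPT for i.i.d. jobs.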

We will use the following forms of Chernoff-Hoeffding bounds in this paper. Let X = Σi Xi, where
Xi ∈ [0, B] are independent random variables. Then, for all ε > 1,



Our analysis often involves relating different order statistics of a random variable (e.g., how does the
size of a job on its best machine compare to that on its second best machine?). We relate these different order
statistics via the stochastic dominance relation. This is useful in our analysis because stochastic dominance
is preserved by the max and sum operators. We say that a random variable X is stochastically dominated
by another random variable Y if for all t, Pr[X ≤ t] ≥ Pr[Y ≤ t]. Stochastic dominance is equivalent to
being able to couple the two random variables X and Y so that X is never larger than Y.
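
For finite samples the definition and the coupling view can both be checked directly. The following Python sketch is an illustration with hypothetical function names; it treats each list as a uniform distribution over its entries and uses the non-strict form of the definition, Pr[X ≤ t] ≥ Pr[Y ≤ t] for all t.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def is_dominated(xs: Sequence[float], ys: Sequence[float]) -> bool:
    """True if the empirical distribution of xs is stochastically dominated by that of ys."""
    xs, ys = sorted(xs), sorted(ys)
    for t in sorted(set(xs) | set(ys)):
        # Compare the two empirical CDFs at t by cross-multiplying (avoids float division).
        if bisect_right(xs, t) * len(ys) < bisect_right(ys, t) * len(xs):
            return False
    return True


def coupled_pairs(xs: Sequence[float], ys: Sequence[float]) -> List[Tuple[float, float]]:
    """Quantile coupling for equal-size samples: pair the kth smallest x with the kth smallest y."""
    assert len(xs) == len(ys)
    return list(zip(sorted(xs), sorted(ys)))
```

When is_dominated(xs, ys) holds for equal-size samples, every pair returned by coupled_pairs has its first entry no larger than its second, which is the coupling described above. Taking the max of independent dominated variables preserves the relation, as the definition predicts.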

Below, the first lemma relates the ith order statistic over some number of draws to the first order statistic 
over half the draws. The second relates the minimum over several draws of a random variable to a single 
draw of that variable. The third relates the maximum over multiple draws of a random variable to an 
appropriate sum over those draws. These lemmas are proved in Section 5.

Lemma 2.8 Let X be any nonnegative random variable and m and i < m be arbitrary integers. Let a 
be defined such that Pr[X < a] = 1/m (or for discontinuous distributions, a = sup{z : Pr[X < z] <
1/m}). Then X[i:m] is stochastically dominated by max(a, X[1:m/2][4^i:4^i]).

Lemma 2.9 For a random variable X whose distribution satisfies the monotone hazard rate condition, X 
is stochastically dominated by rX[1:r].

Lemma 2.10 Let K1, ..., Kn be independent and identically distributed integer random variables such
that for some constant c > 1, we have Kj > c, and let W1, ..., Wn be arbitrary independent nonnegative
variables. Then, 

E [maxj Wj[Kj:Kj]] < ^ E [Ki] E [maxj Wj] . 

We will analyze the expected makespan of a mechanism as the maximum over a number of correlated 
real-valued random variables. The correlation among these variables makes it difficult to understand and 
bound the makespan. Our approach will be to replace these random variables with an ensemble of indepen- 
dent random variables that have the same marginal distributions. Fortunately, this operation does not change 
the expected maximum by too much. Our next lemma relates the expected maximum over an arbitrary set 
of random variables to the expected maximum over a set of independent variables with the same marginal 
distributions. It is a simple extension of the correlation gap results of Aggarwal et al. [1], Yan [28], and
Chawla et al.

Lemma 2.11 Let X1, ..., Xn be arbitrary correlated real-valued random variables. Let Y1, ..., Yn be
independent random variables defined so that the distribution of Yi is identical to that of Xi for all i. Then,
E[maxj Xj] < ^ E[maxj Yj]. 

3 The bounded overload mechanism 

Recall that the bounded overload mechanism minimizes the total work subject to the additional feasibility
constraint that every machine is assigned at most cη jobs. In this section we prove that the expected
makespan of the bounded overload mechanism, with the overload set to c = 7, is a 200η factor approximation
to the expected worst best runtime, and thus to the optimal makespan, on m/2 machines.

Intuitively the bounded overload mechanism tries to achieve two objectives simultaneously: (1) keep 
the size of every job on the machine it is scheduled on close to its size on its best machine, but also (2)
evenly distribute the jobs across all the machines. Recall that the minimum work mechanism achieves
the first objective exactly, but fails on the second objective. Due to the independence between jobs, the 
number of jobs on each machine may be quite unevenly distributed. In contrast, the bounded overload 
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mechanism explicitly disallows uneven assignments of jobs and therefore the main issue to address in its 
analysis is whether it satisfies the first objective, i.e., that the sizes of the jobs are close to what they are in 
the minimum work mechanism. 

To set up for the proof of Theorem 2.2, consider the following definitions that describe the outcome of
the bounded overload mechanism and the worst best runtime on $m/2$ machines (which bounds the optimal
makespan on $m/2$ machines). Let $T_j$ denote a random variable drawn according to job $j$'s distribution of
runtimes $F_j$. Let $B_j$ denote the job's best runtime out of $m/2$ machines, i.e., $B_j = T_j[1:m/2]$, the first order
statistic of $m/2$ draws. The expected worst best runtime on $m/2$ machines is $E[\max_j B_j]$. The bounded
overload mechanism considers placing each job on one of $m$ machines. These runtimes of job $j$ drawn i.i.d.
from $F_j$ impose a (uniformly random) ordering over the machines starting from the machine that is "best"
for $j$ to the one that is "worst"; this is $j$'s preference list. Let $T_j[r:m]$ denote the size of job $j$ on the $r$th
machine in this ordering (also called the job's $r$th favorite machine). Let $R_j$ be a random variable to denote
the rank of the machine that job $j$ is placed on by the bounded overload mechanism. As each machine is
constrained to receive at most $c\eta$ jobs, the expected makespan of bounded overload is at most
$c\eta\, E[\max_j T_j[R_j:m]]$. We will bound this quantity in terms of $E[\max_j B_j]$.

There are three main parts to our argument. First, we note that the $R_j$'s are correlated across different
$j$'s, and so are the $T_j[R_j:m]$'s. This makes it challenging to directly analyze $E[\max_j T_j[R_j:m]]$. We use
Lemma 2.11 to replace the $R_j$'s in this expression by independent random variables with the same marginal
distributions. We then show that the marginal distributions can be bounded by simple geometric random
variables $\tilde{R}_j$. To do so, we introduce another procedure for assigning jobs to machines that we call the last
entry procedure. The assignment of each job under the last entry procedure is no better than its assignment
under bounded overload. On the other hand, the ranks of the machines to which jobs are allocated in the last
entry procedure are geometric random variables with a bounded failure rate. Finally, we relate the runtimes
$T_j[\tilde{R}_j:m]$ to the optimal runtimes $B_j$ using Lemma 2.8.

We begin by describing the last entry procedure. 

last entry In order to schedule job $j$, we first apply the bounded overload mechanism $BO_c$ to all jobs other
than $j$. We then place $j$ on the first machine in its preference list that has fewer than $c\eta$ jobs. Let $L_j$
denote the rank of the machine to which $j$ gets allocated.
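The placement step of the last entry procedure can be sketched as follows. This is a minimal illustration, not the authors' implementation: the machine loads are taken as given input rather than computed by running bounded overload on the other jobs, and the function name and arguments are hypothetical.

```python
from typing import Sequence

def last_entry_rank(preference: Sequence[int], loads: Sequence[int], cap: int) -> int:
    """Return the 1-based rank L_j of the first machine in j's preference list
    that currently holds fewer than `cap` jobs.

    preference: machine indices ordered from j's best machine to its worst.
    loads: loads[i] is the number of other jobs already placed on machine i
           (assumed given; in the text they come from bounded overload without j).
    cap: the per-machine limit, playing the role of c * eta.
    """
    for rank, machine in enumerate(preference, start=1):
        if loads[machine] < cap:
            return rank
    raise ValueError("every machine is full; cap is too small for this instance")

if __name__ == "__main__":
    # Machines 0 and 2 are full (3 jobs each, cap 3); machine 1 has room.
    print(last_entry_rank([2, 0, 1], [3, 1, 3], cap=3))
```

In the example the job's two favorite machines are full, so it lands on its third choice.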

We now make a few observations about the ranks Lj realized by the last entry procedure. 

Lemma 3.1 The runtime of any job $j$ in bounded overload is no worse than its runtime in the last entry
procedure. That is, $R_j \le L_j$.

Proof: Fix any instantiation of jobs' runtimes over machines. Consider the assignment of job j in the last 
entry procedure, and let LE(j) denote the schedule where all of the jobs but j are scheduled according 
to bounded overload and j is scheduled according to the last entry procedure. Since the bounded overload 
mechanism minimizes total work, the total runtime of all of the jobs in BOc is no more than the total runtime 
of all of the jobs in LE(j). On the other hand, the total runtime of all jobs except j in LE(j) is no more than 
the total runtime of all jobs except j in BOc. This immediately implies that j's runtime in bounded overload 
is no more than its runtime in last entry. Since this holds for any fixed instantiation of runtimes, we have
$R_j \le L_j$. ■

Next, we show that the rank $L_j$ of a job $j$ in last entry is stochastically dominated by a geometric random
variable $\tilde{R}_j$ that is capped at $\lceil m/c \rceil$. Note that $L_j$ is at most $\lceil m/c \rceil$ since $\lceil m/c \rceil$ machines can accommodate
$\lceil m/c \rceil c\eta \ge n$ jobs and therefore last entry will never have to send a job to anything worse than its $\lceil m/c \rceil$th
favorite machine. The random variable $\tilde{R}_j$ also lives in $\{1, \ldots, \lceil m/c \rceil\}$, and is drawn independently for all $j$
as follows: for $i \in \{1, \ldots, \lceil m/c \rceil - 1\}$, we have $\Pr[\tilde{R}_j = i] = (1/c)^{i-1}(1 - 1/c)$, and the remaining probability
mass is on $\lceil m/c \rceil$.

Lemma 3.2 The rank $L_j$ of a job $j$ in last entry is stochastically dominated by $\tilde{R}_j$, and so the runtime of
job $j$ in last entry is stochastically dominated by $T_j[\tilde{R}_j:m]$.

Proof: We use the principle of deferred decisions. In order to schedule $j$, the last entry procedure first runs
bounded overload on all of the jobs other than $j$. This produces a schedule in which at most a $1/c$ fraction
of the machines have all of their slots occupied. Conditioned on this schedule, job $j$'s preference list over
machines is a uniformly random permutation. So the probability (over the draw of $j$'s runtimes) that job $j$'s
favorite machine is fully occupied is at most $1/c$. Likewise, the probability that the job's two most favorite
machines are both occupied is at most $1/c^2$, and so on. Therefore, the rank of the machine on which $j$ is
eventually scheduled is dominated by a geometric random variable with failure rate $1/c$. ■

Lemmas 3.1 and 3.2 yield the following corollary.

Corollary 3.3 For all $j$, the runtime $T_j[R_j:m]$ of job $j$ in bounded overload is stochastically dominated by
$T_j[\tilde{R}_j:m]$.
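The tail bound behind the geometric domination can be checked exactly. When the preference list is a uniformly random permutation of $m$ machines and at most $m/c$ of them are full, the probability that the $i$ most preferred machines are all full is a ratio of binomial coefficients, and it is at most $c^{-i}$. The sketch below evaluates that ratio; the function name and the example values of $m$ and $c$ are illustrative assumptions.

```python
from math import comb

def prob_top_i_all_full(m: int, full: int, i: int) -> float:
    """Pr[the i most preferred machines are all full] when the preference list is a
    uniformly random permutation of m machines and exactly `full` of them are full."""
    if i > full:
        return 0.0
    # Choose which i of the full machines occupy the top i positions.
    return comb(full, i) / comb(m, i)

if __name__ == "__main__":
    m, c = 70, 7
    full = m // c  # at most a 1/c fraction of the machines can be full
    for i in range(1, 6):
        print(i, prob_top_i_all_full(m, full, i), (1 / c) ** i)
```

Each printed probability is no larger than the geometric tail $(1/c)^i$ next to it.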

The benefit of relating the $T_j[R_j:m]$'s with the $T_j[\tilde{R}_j:m]$'s is that while the former are correlated random
variables, the latter are independent, because the $\tilde{R}_j$'s are picked independently. Corollary 3.3 implies that we
can replace the former with the latter, gaining independence, while losing only a constant factor in expected
makespan.

Corollary 3.4 $E[\max_j T_j[R_j:m]]$ is no more than $e/(e-1)$ times $E[\max_j T_j[\tilde{R}_j:m]]$.

The final part of our analysis relates the $T_j[\tilde{R}_j:m]$'s to the $B_j$'s. A natural inequality to aim for is to bound
$E[T_j[\tilde{R}_j:m]]$ from above by a constant times $E[B_j]$ for each $j$. Unfortunately, this is not enough for our
purposes: note that our goal is to upper bound $E[\max_j T_j[\tilde{R}_j:m]]$ in terms of $E[\max_j B_j]$. Thus we proceed to
show that $T_j[\tilde{R}_j:m]$ is stochastically dominated by a maximum among some number of copies of $B_j$. We
apply Lemma 2.8 (stated in Section 2 and proved in Section 5) to the random variable $T_j[i:m]$ for this purpose.
Define $\alpha_j = \sup\{t : F_j(t) < 1/m\}$. Then the lemma shows that $T_j[i:m]$ is stochastically dominated by
$\max(\alpha_j, B_j[4^i:4^i])$.

Let $D_j$ be defined as $4^{\tilde{R}_j}$. Note that $E[D_j]$ can be bounded by a constant whenever $c > 4$ (this upper
bound is obtained by treating $\tilde{R}_j$ as a geometric random variable without being capped at $\lceil m/c \rceil$). Then
Lemma 2.8 implies the following lemma.

Lemma 3.5 $T_j[\tilde{R}_j:m]$ is stochastically dominated by $\max(\alpha_j, B_j[D_j:D_j])$.
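As a worked check of the claim that $E[D_j]$ is bounded by a constant for $c > 4$, assume the uncapped geometric law $\Pr[\tilde{R}_j = i] = (1/c)^{i-1}(1-1/c)$ for $i \ge 1$ (this explicit form is an assumption, chosen to match a failure rate of $1/c$). Then the expectation is a geometric series:

```latex
\begin{align*}
E[D_j] = E\big[4^{\tilde{R}_j}\big]
  &= \sum_{i \ge 1} 4^{i} \Big(\frac{1}{c}\Big)^{i-1} \Big(1 - \frac{1}{c}\Big) \\
  &= 4 \Big(1 - \frac{1}{c}\Big) \sum_{i \ge 0} \Big(\frac{4}{c}\Big)^{i}
   = \frac{4(1 - 1/c)}{1 - 4/c}
   = \frac{4(c-1)}{c-4}, \qquad c > 4 .
\end{align*}
```

The series converges only when $4/c < 1$, which is where the requirement $c > 4$ comes from. At $c = 7$ the value is 8. Capping $\tilde{R}_j$ at $\lceil m/c \rceil$ only moves probability mass to smaller values of $4^{\tilde{R}_j}$, so the capped expectation is no larger.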
We are now ready to prove the main theorem of this section. 

Theorem 2.2 For $n$ jobs, $m$ machines, load factor $\eta = n/m$, and runtimes distributed according to a
machine-symmetric product distribution, the expected makespan of the bounded overload mechanism with
overload $c = 7$ is a $200\eta$ approximation to the expected worst best runtime, and hence also to the optimal
makespan, on $m/2$ machines.

Proof: The proof follows from the following series of inequalities that we explain below. First we have
$\mathrm{Makespan}(BO_c) \le c\eta\, E[\max_j T_j[R_j:m]]$ by the fact that $BO_c$ schedules at most $c\eta$ jobs per machine. Then,

$$
\begin{aligned}
\left(1 - \tfrac{1}{e}\right) E\left[\max_j T_j[R_j:m]\right]
  &\le E\left[\max_j T_j[\tilde{R}_j:m]\right] \\
  &\le E\left[\max_j \max(\alpha_j, B_j[D_j:D_j])\right] \\
  &\le E\left[\max_j \left(\alpha_j + B_j[D_j:D_j]\right)\right] \\
  &\le \max_j \alpha_j + E\left[\max_j B_j[D_j:D_j]\right] \\
  &\le 2\,OPT_{1/2} + 2\,E[D_j]\, E\left[\max_j B_j\right].
\end{aligned}
$$

The first of the inequalities follows from Corollary 3.4, the second from Lemma 3.5, and the third and fourth
from noting that the maximum of non-negative quantities is upper bounded by their sum. For the fifth
inequality we use Lemma 2.10 to bound the second term; since $D_j \ge 4$, the lemma applies with its constant
set to 2. For the first term in that inequality consider the job $j$ that has the largest $\alpha_j$. For this
job, the probability that its size on all of the $m/2$ machines in $OPT_{1/2}$ is at least $\alpha_j$ is
$(1 - F_j(\alpha_j))^{m/2} \ge (1 - 1/m)^{m/2} \ge 1/2$ by the definition of $\alpha_j$. So $OPT_{1/2} \ge \max_j \alpha_j / 2$.
Finally, by the definition of $OPT_{1/2}$ we have $E[\max_j B_j] \le OPT_{1/2}$, and $E[D_j] \le 4\frac{c-1}{c-4}$.

The final approximation factor therefore is $c\eta\, \frac{e}{e-1} \left(2 + 8\frac{c-1}{c-4}\right)$ for all $c > 4$. At $c = 7$, this
evaluates to a factor $200\eta$ approximation. ■
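The dependence of the guarantee on the overload parameter can be tabulated numerically. The sketch below takes the final factor to be $c \cdot \frac{e}{e-1} \cdot (2 + 8\frac{c-1}{c-4})$ times $\eta$ (this closed form is an assumption here, checked only against the stated value of about 200 at $c = 7$); the function name is hypothetical.

```python
import math

def approximation_factor(c: float) -> float:
    """Coefficient of eta in the bound c * e/(e-1) * (2 + 8(c-1)/(c-4)), valid for c > 4."""
    if c <= 4:
        raise ValueError("the bound requires c > 4")
    return c * math.e / (math.e - 1) * (2 + 8 * (c - 1) / (c - 4))

if __name__ == "__main__":
    for c in range(5, 13):
        print(c, round(approximation_factor(c), 2))
```

Under this reading, small $c$ makes the term $8(c-1)/(c-4)$ blow up while large $c$ inflates the leading factor, and among integer values the expression is smallest at $c = 7$.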



4 The sieve and bounded overload mechanism 

We will now analyze the performance of the sieve and bounded overload mechanisms under the assumption
that the jobs are a priori identical. Let us consider the sieve mechanism first. Recall that this is essentially
the minimum work mechanism where every job is assigned to its best machine, except that jobs with a size
larger than $\beta$ on every machine are left unscheduled. The bound of $\beta$ on the size of scheduled jobs allows
us to employ concentration results to bound the expected makespan of the mechanism. Changing the value
of $\beta$ allows us to trade off the makespan of the mechanism with the number of unscheduled jobs.

Lemma 4.1 For $k < \log m$, the expected makespan of the sieve mechanism with $\beta = \frac{n}{km} E[T[1:m]]$ is no more
than $O(\log m/k)$ times the expected average best runtime, and hence also the expected optimal makespan.
The expected number of jobs left unscheduled by the mechanism is at most $km$.

Proof: Let us first consider the expected total work of any single machine, that is the expected total size
of jobs scheduled on that machine. Let $Y_{ij}$ be a random variable that takes on the value 0 if job $j$ is not
scheduled on machine $i$, and takes on the size of $j$ on machine $i$ if the job is scheduled on that machine.
The probability that $j$ is scheduled on $i$ is no more than $1/m$; its expected size on $i$ conditioned on being
scheduled is at most $\tau = E[T[1:m]]$. Therefore, $E[\sum_j Y_{ij}] \le n\tau/m$, which in turn is at most the average best
runtime.

Note that the $Y_{ij}$'s are independent and bounded random variables. So we can apply Chernoff-Hoeffding
bounds and use $\beta = \frac{n\tau}{km}$ to get that, for each machine $i$, the total work $\sum_j Y_{ij}$ exceeds $O(\log m/k)$ times
$n\tau/m$ with probability at most $1/m^2$.



Taking the union bound over the $m$ machines, we get that with probability $1 - 1/m$, the makespan of the
sieve mechanism is at most $O(\log m/k)$ times OPT.

We will now convert this tail probability into a bound on the expected makespan. Let $\gamma$ denote the factor
by which the expected makespan of the mechanism exceeds OPT. Remove all jobs with best runtimes
greater than $\beta$ from consideration and consider creating sieve's schedule by assigning each of the leftover
jobs to their best machine (minimizing total work) one-by-one in decreasing order of best runtime, until the
makespan exceeds $\frac{7}{k} \log m$ times OPT. This event happens with a probability at most $1/m$. When this event
happens, we are left with a smaller set of jobs; conditioned on being left over at this point, these jobs have a
smaller best runtime than the average over all scheduled jobs. Thus the expected makespan for scheduling
them will be at most $\gamma\,$OPT. So we get $\gamma \le 7\log m/k + \gamma/m$, i.e., $\gamma = O(\log m/k)$. This implies the first
part of the lemma.

We now prove the second part of the lemma, i.e., the expected number of jobs left unscheduled is at most
$km$. Note that $\beta$ exceeds a job's expected best runtime by a factor of $n/km$. Thus by applying Markov's
inequality, we get the probability of a job's best runtime being larger than $\beta$ to be at most $km/n$. Hence the
expected number of jobs with best runtime larger than $\beta$ is at most $km$. ■
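The Markov step in the second part of the proof can be illustrated by simulation. The sketch below assumes, purely for illustration, Exponential(1) runtimes (so that $\tau = E[T[1:m]] = 1/m$) and the reserve $\beta = n\tau/(km)$; the function name and parameters are hypothetical.

```python
import random

def simulate_unscheduled(n: int, m: int, k: float, trials: int = 2000, seed: int = 0) -> float:
    """Average number of jobs whose best runtime exceeds beta = n * tau / (k * m).

    Assumption for illustration: runtimes are i.i.d. Exponential(1), so the best of
    m runtimes is Exponential(m) and tau = E[T[1:m]] = 1/m.
    """
    rng = random.Random(seed)
    tau = 1.0 / m
    beta = n * tau / (k * m)
    total = 0
    for _ in range(trials):
        for _ in range(n):
            best = min(rng.expovariate(1.0) for _ in range(m))
            if best > beta:  # the sieve leaves this job unscheduled
                total += 1
    return total / trials

if __name__ == "__main__":
    n, m, k = 60, 10, 3
    print("average unscheduled:", simulate_unscheduled(n, m, k, trials=500))
    print("Markov bound k*m:", k * m)
```

For this distribution the bound $km$ is loose, since Markov's inequality uses only the mean of the best runtime and ignores its exponential tail.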

Next we will combine the sieve mechanism with the bounded overload mechanism. We consider two
different choices of parameters. Note that if in expectation the sieve mechanism leaves $km$ jobs unscheduled,
using the bounded overload mechanism to schedule these jobs over a set of $\Omega(m)$ machines gives us an
expected makespan that is at most $O(k)$ larger than the expected optimal makespan on that number of
machines. In order to balance this with the makespan achieved by sieve, we pick $k = \sqrt{\log m}$. This gives
us Theorem 2.4.



Theorem 2.4 For $n$ jobs, $m$ machines, and runtimes from an i.i.d. distribution, the expected makespan of
the sieve and bounded overload mechanism with overload $c = 7$, partition parameter $\delta = 2/3$, and reserve
$\beta = \frac{n}{m\sqrt{\log m}} E[T[1:\frac{1}{3}m]]$, is an $O(\sqrt{\log m})$ approximation to the larger of the worst best runtime and the
average best runtime, and hence also to the optimal makespan, on $m/3$ machines. Here $T$ denotes a draw
from the distribution on job sizes.

Proof: For the choice of parameters in the theorem statement, we use $m/3$ of the $m$ machines for the
sieve mechanism, and the remainder for the bounded overload mechanism. The expected makespan of the
overall mechanism is no more than the sum of the expected makespans of the two constituent mechanisms.
Lemma 4.1 implies that the expected makespan of the sieve mechanism is
$O(\sqrt{\log m})$ times $OPT_{1/3}$, and the load factor for the bounded overload mechanism is also $O(\sqrt{\log m})$.
Theorem 2.2 then implies that the expected makespan of the bounded overload mechanism is also $O(\sqrt{\log m})$
times $OPT_{1/3}$. ■
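The choice $k = \sqrt{\log m}$ balances the two contributions, $O(\log m/k)$ from sieve and $O(k)$ from bounded overload. The sketch below makes this concrete under the simplifying assumption that all hidden constants equal 1; the function names are hypothetical.

```python
import math

def balanced_k(m: int) -> float:
    """The k that equalizes log(m)/k (sieve term) and k (bounded overload term)."""
    return math.sqrt(math.log(m))

def combined_bound(m: int, k: float) -> float:
    """Sum of the two terms with all hidden constants set to 1 (an assumption)."""
    return math.log(m) / k + k

if __name__ == "__main__":
    m = 1024
    k_star = balanced_k(m)
    for k in (0.5, 1.0, k_star, 4.0, 6.0):
        print(round(k, 3), round(combined_bound(m, k), 3))
```

With unit constants the sum is minimized exactly at the balancing point, where it equals $2\sqrt{\log m}$.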

If we partition the machines across the sieve and the bounded overload mechanisms roughly equally,
then Theorem 2.4 gives us the optimal choice for the parameter $\beta$. A different possibility is to perform a
more aggressive screening of jobs by using a smaller $\beta$, while comparing our performance against a more
heavily penalized optimal mechanism, one that is allowed to use only a $\delta$ fraction of the machines.

Theorem 2.6 For $n \ge m\log m$ jobs, $m$ machines, and runtimes from an i.i.d. distribution, the expected
makespan of the sieve and bounded overload mechanism with overload $c = 7$, partition parameter
$\delta = 1/\log\log m$, and reserve $\beta = \frac{2n}{m\log m} E[T[1:\frac{\delta}{2}m]]$, is a constant approximation to the larger of the worst
best runtime and the average best runtime, and hence also to the optimal makespan, on $\delta m/2$ machines.
Here $T$ denotes a draw from the distribution on job sizes.



Proof: We will show that the expected makespan of the sieve mechanism is at most a constant times the
average best runtime on $\delta m/2$ machines, and the expected number of unscheduled jobs is $O(\delta m)$. The
current theorem then follows by applying Theorem 2.2.



Let us analyze the expected makespan of the sieve mechanism first. Let $\tau = E[T[1:\frac{\delta}{2}m]]$. Then we can
bound $OPT_{\delta/2}$ as $OPT_{\delta/2} \ge \frac{2n\tau}{\delta m}$. As in the proof of Lemma 4.1, let $Y_{ij}$ be a random variable that takes
on the value 0 if job $j$ is not scheduled on machine $i$, and takes on the size of $j$ on machine $i$ if the job is
scheduled on that machine. Then,

$$E\Big[\sum_j Y_{ij}\Big] \le \frac{n}{(1-\delta)m}\, E[T[1:(1-\delta)m]] \le \frac{n\tau}{(1-\delta)m} \le \frac{\delta}{2(1-\delta)}\, OPT_{\delta/2}.$$

Applying Chernoff-Hoeffding bounds we get

$$\Pr\Big[\sum_j Y_{ij} > 2\,OPT_{\delta/2}\Big] \le \Pr\Big[\sum_j Y_{ij} > 4(1/\delta - 1)\, E\Big[\sum_j Y_{ij}\Big]\Big] \le m^{-1/2\delta}.$$

Here we used $\beta = 2n\tau/(m\log m)$. Taking the union bound over the $m$ machines, we get that with probability
$1 - o(1)$, the makespan of the sieve mechanism is at most twice $OPT_{\delta/2}$. Once again, as in the proof of
Lemma 4.1, we can convert this tail bound into a constant factor bound on the expected makespan.

Now let us consider the jobs left unscheduled. For any given job, we will compute the probability that
its runtime on all of the $(1-\delta)m$ machines is larger than $\beta$. Because $\beta$ is defined in terms of $T[1:\frac{\delta}{2}m]$, we
will consider the machines in batches of size $\delta m/2$ at a time. Using Markov's inequality, the probability that
the job's runtime exceeds $\beta$ on all machines in a single batch is at most $m\log m/2n$. There are $2(1/\delta - 1)$
batches in all, so the probability that a job remains unscheduled is at most $(m\log m/n)\, 2^{-2(1/\delta - 1)}$, which
by our choice of $\delta$ is $O(\delta m/n)$. ■



5 Deferred proofs 

In this section we prove the bounds for random variables and order statistics from Section 2.2.

Lemma 2.8 Let $X$ be any nonnegative random variable, and $m$, $i \le m$ be arbitrary integers. Let $\alpha$ be
defined such that $\Pr[X \le \alpha] = 1/m$ (or for discontinuous distributions, $\alpha = \sup\{z : \Pr[X \le z] < 1/m\}$).
Then $X[i:m]$ is stochastically dominated by $\max(\alpha, X[1:m/2][4^i:4^i])$.

Proof: Let $F$ be the cumulative distribution function of $X$. We prove this by showing that $X[i:m]$ is "almost"
stochastically dominated by $X[1:m/2][4^i:4^i]$; specifically, we show that for all $t \ge \alpha$,

$$\Pr[X[i:m] > t] \le \Pr[X[1:m/2][4^i:4^i] > t].$$

To prove this inequality, we will define a process for instantiating the variables $X[i:m]$ and $X[1:m/2][4^i:4^i]$
in a correlated fashion such that the former is always larger than the other.

$X[1:m/2][4^i:4^i]$ is a statistic based on $4^i m/2$ independent draws of the random variable $X$. Consider
partitioning these draws into groups of size $m$ each. We then randomly split each group into two smaller
groups, which we will refer to as blocks, of size $m/2$ each. Define a good event $\mathcal{G}$ to be the event that at
least one of these $4^i/2$ groups gets split such that the $i$ smallest runtimes in it all fall into the same block. If
event $\mathcal{G}$ occurs, arbitrarily choose one group which caused event $\mathcal{G}$, and for all $k$ define $X[k:m]$ to be the
$k$th min from this group. Otherwise, select an arbitrary group to define the $X[k:m]$. Note that since we split
the groups into blocks randomly, and this is independent of the drawn runtimes in the groups, $X[k:m]$ has
the correct distribution, both when $\mathcal{G}$ occurs and does not occur. Define the minimum from each of the $4^i$
blocks to be a draw of $X[1:m/2]$. Thus, whenever $\mathcal{G}$ occurs, the probability that $X[1:m/2][4^i:4^i] > t$ is
at least the probability that $X[i+1:m] > t$. We have that

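$$\Pr[X[1:m/2][4^i:4^i] > t] \ge \Pr[\mathcal{G}] \cdot \Pr[X[i+1:m] > t].$$

We now show that $\Pr[\mathcal{G}] \cdot \frac{\Pr[X[i+1:m] > t]}{\Pr[X[i:m] > t]} \ge 1$ whenever $F(t) \ge 1/m$, which completes our proof of the
lemma. Note that

$$\frac{\Pr[X[i+1:m] > t]}{\Pr[X[i:m] > t]}
  = \frac{\sum_{k=0}^{i} \binom{m}{k} F(t)^k (1 - F(t))^{m-k}}{\sum_{k=0}^{i-1} \binom{m}{k} F(t)^k (1 - F(t))^{m-k}}
  = 1 + \frac{\binom{m}{i} F(t)^i (1 - F(t))^{m-i}}{\sum_{k=0}^{i-1} \binom{m}{k} F(t)^k (1 - F(t))^{m-k}},$$

which we can see is an increasing function of $F(t)$. Thus in the range $F(t) \ge 1/m$, it attains its minimum
precisely at $F(t) = 1/m$. Substituting $F(t) = 1/m$ into the above, and using standard approximations for
$\binom{m}{k}$ (namely $(m/k)^k \le \binom{m}{k} \le (me/k)^k$), we have

$$\frac{\Pr[X[i+1:m] > t]}{\Pr[X[i:m] > t]}
  \ge 1 + \frac{(m/i)^i (1/m)^i (1 - 1/m)^{m-i}}{(1 - 1/m)^m + \sum_{k=1}^{i-1} (me/k)^k (1/m)^k (1 - 1/m)^{m-k}}
  \ge 1 + \frac{i^{-i}}{1 + (i-1) \max_k (e/k)^k}
  \ge 1 + \frac{i^{-i}}{1 + (i-1)e}.$$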



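It suffices to show that this last quantity, when multiplied with $\Pr[\mathcal{G}]$, is at least 1. We consider the
complement of event $\mathcal{G}$, call it event $\mathcal{B}$. The event $\mathcal{B}$ occurs only when none of the $4^i/2$ groups split favorably.
The probability that a group splits favorably (for $i > 1$) is $2\binom{m-i}{m/2-i} / \binom{m}{m/2} \approx 2^{-(i-1)}$. So we can
see that $\Pr[\mathcal{B}] \le (1 - 2^{-(i-1)})^{4^i/2} \le e^{-(4/2)^i}$, and thus $\Pr[\mathcal{G}] \ge 1 - e^{-(4/2)^i}$. It can be verified that
the product of this bound with the last quantity above is at least 1. ■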


Lemma 2.9 For a random variable $X$ whose distribution satisfies the monotone hazard rate condition, $X$
is stochastically dominated by $rX[1:r]$.

Proof: The hazard rate function is related to the cumulative distribution function as
$\Pr[X \ge t] = e^{-\int_0^t h(z)\,dz}$. Likewise, we can write:

$$\Pr[rX[1:r] \ge t] = \Pr[X[1:r] \ge t/r] = \left(e^{-\int_0^{t/r} h(z)\,dz}\right)^r = e^{-r\int_0^{t/r} h(z)\,dz}.$$

In order to prove the lemma, we need only show that $\int_0^t h(z)\,dz \ge r \cdot \int_0^{t/r} h(z)\,dz$. Since the hazard rate
function $h(z)$ is monotone, the function $\int_0^t h(z)\,dz$ is a convex function of $t$. The required inequality follows
from the definition of convexity. ■
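Lemma 2.9 can be checked numerically on a concrete family. The sketch below uses the unit-scale Weibull distribution, an illustrative choice not taken from the paper: its survival function is $e^{-t^a}$ and its hazard rate $a t^{a-1}$ is non-decreasing exactly when the shape $a \ge 1$. Function names are hypothetical.

```python
import math

def survival_weibull(t: float, shape: float) -> float:
    """Pr[X >= t] for a unit-scale Weibull; the hazard rate is non-decreasing iff shape >= 1."""
    return math.exp(-(t ** shape))

def survival_scaled_min(t: float, shape: float, r: int) -> float:
    """Pr[r * X[1:r] >= t] = Pr[X >= t/r]^r, where X[1:r] is the minimum of r i.i.d. copies."""
    return survival_weibull(t / r, shape) ** r

if __name__ == "__main__":
    for shape in (0.5, 1.0, 2.0):
        dominated = all(
            survival_weibull(t, shape) <= survival_scaled_min(t, shape, 4) + 1e-12
            for t in (0.1, 0.5, 1.0, 2.0, 4.0)
        )
        print(shape, "X dominated by 4*X[1:4] on the grid:", dominated)
```

For shape 0.5 the hazard rate is decreasing and the domination fails, which illustrates why the monotone hazard rate condition is needed.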



Lemma 2.10 Let $K_1, \ldots, K_n$ be independent and identically distributed integer random variables such
that for some constant $c > 1$, we have $K_j \ge c$ for all $j$, and let $W_1, \ldots, W_n$ be arbitrary independent
nonnegative variables. Then,

$$E\left[\max_j W_j[K_j:K_j]\right] \le \frac{c}{c-1}\, E[K_1]\, E\left[\max_j W_j\right].$$

Proof: We consider the following process for generating correlated samples for $\max_j W_j$ and $\max_j W_j[K_j:K_j]$.
We first independently instantiate $K_j$ for every $j$; recall that these are identically distributed variables. Let
$k = \sum_j K_j \ge cn$. Then we consider all possible $n!$ permutations of these instantiated values. For each
permutation $\sigma$, we make the corresponding number of independent draws of the random variable $W_j$ for all
$j$; call this set of draws $X_\sigma$. In all, we get $k\,n!$ draws from the distributions, that is, $\sum_\sigma |X_\sigma| = k\,n!$. Exactly
$k(n-1)!$ of these draws belong to any particular $j$; denote these by $Y_j$.

Now, the maximum element out of each of the sets $X_\sigma$ is an independent draw from the same distribution
that $\max_j W_j[K_j:K_j]$ is drawn from. We get $n!$ independent samples from that distribution. Call this set of
samples $X$.

Next note that each set $Y_j$ contains $k(n-1)!$ independent draws from the distribution corresponding
to $W_j$. We construct a uniformly random $n$-dimensional matching over the sets $Y_j$, and from each $n$-tuple
in this matching we pick the maximum. Each such maximum is an independent draw from the distribution
corresponding to $\max_j W_j$, and we get $k(n-1)!$ such samples; call this set of samples $Y$.

Finally, we claim that $E[\sum_{y \in Y} y] \ge (1 - 1/c)\, E[\sum_{x \in X} x]$, with the expectation taken over the
randomness in generating the $n$-dimensional matching across the $Y_j$'s. The lemma follows, since we have
$E[\sum_{x \in X} x] = n!\, E[\max_j W_j[K_j:K_j]]$ as well as

$$E\Big[\sum_{y \in Y} y\Big] = E\big[k(n-1)!\, E[\max_j W_j]\big] = n!\, E[K_j]\, E[\max_j W_j].$$

To prove the claim, we call an $x \in X$ "good" if the $n$-tuple in the matching over $\{Y_j\}$ that it belongs to does
not contain any other element of $X$. Then, $E[\sum_{y \in Y} y] \ge E[\sum_{x \in X} x \cdot \mathbf{1}[x \text{ is "good"}]]$.

Let us compute the probability that some $x$ is "good". Without loss of generality, suppose that $x \in Y_1$.
In order for $x$ to be good, its $n$-tuple must not contain any of the other elements of $X$ from the other $Y_j$'s.
If we define $x_j = |X \cap Y_j|$ then $\Pr[x \text{ is "good"}]$ is at least $\prod_{j \ne 1} \left(1 - \frac{x_j}{k(n-1)!}\right)$, where $\sum_j x_j \le n!$. This
product is minimized when we set one of the $x_j$'s to $n!$ and the rest to 0, and takes on a minimum value of
$1 - n/k \ge 1 - 1/c$. ■
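A quick Monte Carlo experiment illustrates the inequality of Lemma 2.10. The distributions below are assumptions made only for this illustration: $W_j$ is Exponential(1) and $K_j$ is uniform on $\{2, 4\}$, so $K_j \ge c$ with $c = 2$ and $E[K_1] = 3$. The function name is hypothetical.

```python
import random

def estimate_lemma_2_10(n: int, trials: int = 20000, seed: int = 1):
    """Monte Carlo estimates of E[max_j W_j[K_j:K_j]] and E[K_1] * E[max_j W_j].

    Assumptions for illustration: W_j ~ Exponential(1) and K_j uniform on {2, 4},
    so that K_j >= c with c = 2 and E[K_1] = 3.
    """
    rng = random.Random(seed)
    lhs = 0.0
    rhs_max = 0.0
    for _ in range(trials):
        ks = [rng.choice((2, 4)) for _ in range(n)]
        # W_j[K_j:K_j] is the largest of K_j independent draws of W_j.
        lhs += max(max(rng.expovariate(1.0) for _ in range(k)) for k in ks)
        rhs_max += max(rng.expovariate(1.0) for _ in range(n))
    return lhs / trials, 3.0 * rhs_max / trials

if __name__ == "__main__":
    lhs, rhs = estimate_lemma_2_10(5)
    # The lemma predicts lhs <= c/(c-1) * rhs = 2 * rhs for c = 2.
    print(round(lhs, 3), "<=", round(2 * rhs, 3))
```

For light-tailed distributions such as this one the inequality holds with a lot of slack; the lemma is stated for arbitrary nonnegative $W_j$.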



Lemma 2.11 Let $X_1, \ldots, X_n$ be arbitrary correlated real-valued random variables. Let $Y_1, \ldots, Y_n$ be
independent random variables defined so that the distribution of $Y_i$ is identical to that of $X_i$ for all $i$. Then,
$E[\max_j X_j] \le \frac{e}{e-1} E[\max_j Y_j]$.






Proof: We use the following result from [1] (also implicit in [5]). Let $U$ be a universe of $n$ elements, $f$ a
monotone increasing submodular function over subsets of this universe, and $D$ a distribution over subsets
of $U$. Let $\tilde{D}$ be a product distribution (that is, every element is picked independently to draw a set from this
distribution) such that $\Pr_{S \sim D}[i \in S] = \Pr_{S \sim \tilde{D}}[i \in S]$. Then $E_{S \sim D}[f(S)] \le \frac{e}{e-1} E_{S \sim \tilde{D}}[f(S)]$.

To apply this theorem, let us first assume that the variables $X_i$ are discrete random variables over a finite
domain. The universe $U$ will then have one element for each possible instantiation of each variable $X_i$ with
a value equal to that instantiation. Then any joint instantiation of the variables $X_1, \ldots, X_n$ corresponds to
a subset of $U$; let $D$ denote the corresponding distribution over subsets. Let $f$ be the max function over
the instantiated subset. Then $E[\max_j X_j]$ is exactly equal to $E_{S \sim D}[f(S)]$. As before, let $\tilde{D}$ denote the
distribution over subsets of $U$ where each element is picked independently. Likewise, the random variables
$Y_1, \ldots, Y_n$ define a distribution, say $D'$, over subsets of $U$. Note that under $D'$ the memberships of elements
of $U$ in the instantiated subset are negatively correlated: for two elements that correspond to instantiations
of the same variable, including one in the subset implies that the other is not included. This raises the
expected maximum. In other words, $E_{S \sim D'}[f(S)] \ge E_{S \sim \tilde{D}}[f(S)]$. Therefore, we get $E[\max_j X_j] =
E_{S \sim D}[f(S)] \le \frac{e}{e-1} E_{S \sim D'}[f(S)] = \frac{e}{e-1} E[\max_j Y_j]$.

When the variables Xj are defined over a continuous but bounded domain, we can apply the above 
argument to an arbitrarily fine discretization of the variables. Our claim then follows from taking the limit 
as the granularity of the discretization goes to zero. 

Finally, let us address the boundedness assumption. For some small $\epsilon > 0$, let $B$ be defined so that for
all $i$, $\Pr[X_i \ge B] \le \epsilon$. Then the contribution to the expected maximum from values above $B$ is similar
for the $X$s and the $Y$s: the probability that some variable $X_i$ attains the maximum value $b \ge B$ is at most
$\Pr[X_i = b]$ whereas the probability that the variable $Y_i$ attains the maximum value $b \ge B$ is at least
$(1 - \epsilon)^{n-1} \Pr[Y_i = b]$. Therefore, $E[\max_j X_j] \le (1 + o(\epsilon)) \frac{e}{e-1} E[\max_j Y_j]$. Taking the limit as $\epsilon$
goes to zero implies the theorem. ■



Comparing OPT and $OPT_\delta$ We now prove Lemma 2.1. The key intuition behind the lemma is that it
can be viewed as the result of scaling both sides of the stochastic dominance relation of Lemma 2.9 up
by a constant, and as we shall see, the monotone hazard rate condition is retained by the minimum among
multiple draws from a probability distribution.

Lemma 2.1 When the distributions of job sizes have monotone hazard rates the expected worst best and
average best runtimes on $\delta m$ machines are no more than $1/\delta^2$ times the expected worst best and average
best runtimes respectively on $m$ machines.

Proof: We will show that the random variable $T_j[1:\delta m]$ is stochastically dominated by $\frac{1}{\delta} T_j[1:m]$. Then, the
expected worst best runtime with $\delta m$ machines is no more than $1/\delta$ times the expected worst best runtime
with $m$ machines. Likewise, the expected average best runtime with $\delta m$ machines is no more than $1/\delta^2$
times the expected average best runtime with $m$ machines. (The extra $1/\delta$ factor comes about because we
average over $\delta m$ machines for the former, versus over $m$ machines for the latter.)

Our desired stochastic dominance relation is precisely of the form given by Lemma 2.9. In particular,
observe that taking a minimum among $m$ draws is exactly the same as first splitting the $m$ draws into $1/\delta$
groups, selecting the minimum from each group of $\delta m$ draws, and then taking the minimum from this
collection of $1/\delta$ values. Thus, we can see that $(1/\delta) T_j[1:m] = (1/\delta) T_j[1:\delta m][1:1/\delta]$, and so the claim
follows immediately from Lemma 2.9, as long as the distribution of $T_j[1:\delta m]$ has a monotone hazard rate.
We show in Claim 1 below that the first order statistic of i.i.d. monotone hazard rate distributions also has a
monotone hazard rate. ■

Claim 1 A distribution $F$ has a monotone hazard rate if and only if the distribution of the minimum among
$k$ draws from $F$ has a monotone hazard rate.

Proof: Let $F_k$ denote the cdf for the minimum among $k$ draws from $F$. Then we have $F_k(x) = 1 - (1 - F(x))^k$,
and the corresponding density $f_k(x) = k(1 - F(x))^{k-1} f(x)$. Thus the hazard rate function is:

$$h_k(x) = \frac{f_k(x)}{1 - F_k(x)} = \frac{k (1 - F(x))^{k-1} f(x)}{(1 - F(x))^k} = k\, \frac{f(x)}{1 - F(x)}.$$

This is precisely $k$ times the hazard rate function $h(x)$, and therefore, $h_k(x)$ is monotone increasing if and
only if $h(x)$ is. ■
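The identity $h_k(x) = k\,h(x)$ can be confirmed numerically for any distribution given by its density and cdf. The sketch below builds the density and cdf of the minimum of $k$ draws using the formulas in the proof; the helper names and the Weibull example are illustrative assumptions.

```python
import math

def hazard(pdf, cdf, x: float) -> float:
    """Hazard rate f(x) / (1 - F(x))."""
    return pdf(x) / (1.0 - cdf(x))

def min_of_k(pdf, cdf, k: int):
    """Density and cdf of the minimum among k i.i.d. draws from the given distribution."""
    def cdf_k(x: float) -> float:
        return 1.0 - (1.0 - cdf(x)) ** k
    def pdf_k(x: float) -> float:
        return k * (1.0 - cdf(x)) ** (k - 1) * pdf(x)
    return pdf_k, cdf_k

if __name__ == "__main__":
    # Example distribution (an assumption): Weibull with shape 2 and unit scale.
    pdf = lambda x: 2 * x * math.exp(-x * x)
    cdf = lambda x: 1 - math.exp(-x * x)
    pdf3, cdf3 = min_of_k(pdf, cdf, 3)
    for x in (0.1, 0.5, 1.0, 2.0):
        print(x, hazard(pdf3, cdf3, x), 3 * hazard(pdf, cdf, x))
```

The two printed columns agree up to floating-point error, matching the factor of $k$ derived in the proof.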



6 Conclusions 

Non-linear objectives coupled with multi-dimensional preferences present a significant challenge in mechanism
design. Our work shows that this challenge can be overcome for the makespan objective when agents
(machines) are a priori identical. This suggests a number of interesting directions for follow-up. Is the gap 
between the first-best and second-best solutions (i.e. the cost of incentive compatibility) still small when 
agents are not identical? Does knowledge of the prior help? Note that this question is meaningful even if 
we ignore computational efficiency. On the other hand, even if the gap is small, the optimal incentive com- 
patible mechanism may be too complex to find or implement. In that case, can we approximate the optimal 
incentive compatible mechanism in polynomial time? 

Similar questions can be asked for other non-linear objectives. One particularly interesting objective is
max-min fairness, or in the context of scheduling, maximizing the running time of the least loaded machine. 
Unlike for makespan, in this case we cannot simply "discard" a machine (that is, schedule no jobs on it) 
without hurting the objective. This necessitates techniques different from the ones developed in this paper.